Image data transmission device, image data transmission method, image data reception device, and image data reception method

ABSTRACT

A reception side is enabled to consistently and appropriately perform a cutout process based on cropping information.
A container of a predetermined format, for example, a transport stream, having a video stream in which the cropping information is inserted into a header portion is transmitted. Interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream. Whether the image data is 2-dimensional image data or stereoscopic image data of a frame-compatible scheme, the reception side can appropriately interpret the cropping information based on the interpretation information. Accordingly, it is possible to appropriately perform the cutout process (cropping) based on the cropping information and correctly generate display image data.

TECHNICAL FIELD

The present technology relates to an image data transmission device, an image data transmission method, an image data reception device, and an image data reception method, and more particularly, to an image data transmission device of an image transmission and reception system in which a transmission side transmits cropping information in addition to image data and a reception side performs a cutout process on the image data based on the cropping information.

BACKGROUND ART

For example, PTL 1 proposes a scheme for transmitting stereoscopic image data using television airwaves. In this case, the stereoscopic image data including left-eye image data and right-eye image data is transmitted and stereoscopic image display is performed using binocular disparity in a television receiver.

FIG. 22 is a diagram illustrating a relation between the display positions of left and right images of an object (body) on a screen and a reproduction position of its stereoscopic image (3D image) when stereoscopic image display is performed using binocular disparity. For example, since left and right lines of sight intersect with each other in front of the screen surface in regard to an object A displayed on a screen in such a manner that a left image La is deviated to the right side and a right image Ra is deviated to the left side on the screen, as illustrated, the reproduction position of its stereoscopic image is located in front of the screen surface. DPa indicates a parallax vector in the horizontal direction in regard to the object A.

For example, since left and right lines of sight intersect with each other on the screen surface in regard to an object B of which a left image Lb and a right image Rb are displayed at the same position on the screen, as illustrated, the reproduction position of its stereoscopic image is on the screen surface. For example, since left and right lines of sight intersect with each other in the rear of the screen surface in regard to an object C displayed on the screen in such a manner that a left image Lc is deviated to the left side and a right image Rc is deviated to the right side on the screen, as illustrated, the reproduction position of its stereoscopic image is located in the rear of the screen surface. DPc indicates a parallax vector in the horizontal direction in regard to the object C.

In the past, frame-compatible schemes such as a side by side scheme and a top and bottom scheme have been known as transmission formats of stereoscopic image data. For example, FIG. 23(a) is a diagram illustrating the side by side scheme and FIG. 23(b) is a diagram illustrating the top and bottom scheme. Here, a case of a pixel format of 1920×1080 is illustrated.

The side by side scheme is a scheme of transmitting pixel data of left-eye image data in the first half in the horizontal direction and transmitting pixel data of right-eye image data in the second half in the horizontal direction, as illustrated in FIG. 23(a). In the case of this scheme, the pixel data in the horizontal direction in each of the left-eye image data and the right-eye image data is thinned out to ½ and a horizontal resolution is thus a half of the original signal.

As illustrated in FIG. 23(b), the top and bottom scheme is a scheme of transmitting data of each line of left-eye image data in the first half in the vertical direction and transmitting data of each line of right-eye image data in the second half in the vertical direction. In the case of this scheme, the lines of the left-eye image data and the right-eye image data are thinned out to ½ and a vertical resolution is a half of the original signal.
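The two frame-compatible layouts described above can be sketched in Python as follows. This is only an illustration under stated assumptions: images are plain lists of rows, the function names are hypothetical, and thinning is done by simply keeping every other pixel or line, whereas an actual encoder may filter before thinning.

```python
def pack_side_by_side(left, right):
    """Left-eye pixels fill the first half of each line, right-eye pixels the second half."""
    # Assumption: thin out to 1/2 by keeping every other pixel of each line.
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def pack_top_and_bottom(left, right):
    """Left-eye lines fill the first half of the frame, right-eye lines the second half."""
    # Assumption: thin out to 1/2 by keeping every other line.
    return left[::2] + right[::2]


# Tiny 4x2 "images" standing in for the left-eye and right-eye pictures.
left = [[1, 2, 3, 4], [5, 6, 7, 8]]
right = [[11, 12, 13, 14], [15, 16, 17, 18]]

sbs = pack_side_by_side(left, right)    # same frame size, half horizontal resolution per eye
tab = pack_top_and_bottom(left, right)  # same frame size, half vertical resolution per eye
```

In both cases the packed frame has the same pixel format as a single original image, which is why a frame-compatible stream can pass through equipment designed for 2-dimensional image data.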

Hereinafter, a process of generating display image data on the reception side will be simply described. FIG. 24(a) schematically illustrates a process relevant to two-dimensional image data with a pixel format of 1920×1080. In this case, since encoding is performed on each block of 16×16 on the transmission side, 8 lines formed from blank data are added and the encoding is performed to obtain image data of 1920 pixels×1088 lines.

Therefore, image data of 1920 pixels×1088 lines can be obtained on the reception side after decoding. However, since the 8 lines in the image data are the blank data, the image data of 1920 pixels×1080 lines including actual image data is cut out based on cropping information included in a video data stream and display image data for a two-dimensional television receiver (hereinafter, appropriately referred to as a “2D TV”) is generated.
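The figure of 8 blank lines follows from rounding the picture height up to a multiple of the 16×16 block size. A minimal Python sketch of that arithmetic is given below; the helper name `padded_size` is hypothetical and is used only for illustration.

```python
def padded_size(width, height, block=16):
    """Round each dimension up to the next multiple of the coding block size."""
    padded_w = -(-width // block) * block   # ceiling division, then back to pixels
    padded_h = -(-height // block) * block  # ceiling division, then back to lines
    return padded_w, padded_h


coded_w, coded_h = padded_size(1920, 1080)
blank_lines = coded_h - 1080  # lines of blank data that the reception side must crop away
```

The width of 1920 is already a multiple of 16, so only the height changes, from 1080 to 1088 lines.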

FIG. 24(b) is a diagram schematically illustrating a process relevant to stereoscopic image data (3-dimensional image data) of the side by side scheme with a pixel format of 1920×1080. Even in this case, since the encoding is performed on each block of 16×16 on the transmission side, 8 lines formed from blank data are added and the encoding is performed to obtain image data of 1920 pixels×1088 lines.

Therefore, image data of 1920 pixels×1088 lines can be obtained on the reception side after decoding. However, since the 8 lines in the image data are the blank data, the image data of 1920 pixels×1080 lines including actual image data is cut out based on cropping information included in a video data stream. Then, the image data is halved into left and right data, a scaling process is performed on each data, and left-eye display image data and right-eye display image data of a stereoscopic television receiver (hereinafter, appropriately referred to as a “3D TV”) are generated.

FIG. 24(c) is a diagram schematically illustrating a process relevant to stereoscopic image data (3-dimensional image data) of the top and bottom scheme with a pixel format of 1920×1080. Even in this case, since the encoding is performed on each block of 16×16 on the transmission side, 8 lines formed from blank data are added and the encoding is performed to obtain image data of 1920 pixels×1088 lines.

Therefore, image data of 1920 pixels×1088 lines can be obtained on the reception side after decoding. However, since the 8 lines in the image data are the blank data, the image data of 1920 pixels×1080 lines including actual image data is cut out based on cropping information included in a video data stream. Then, the image data is halved into top and bottom data, a scaling process is performed on each data, and left-eye display image data and right-eye display image data of a 3D TV are generated.

CITATION LIST

Patent Literature

-   PTL 1: Japanese Unexamined Patent Application Publication No. 2005-6114

SUMMARY OF INVENTION

Technical Problem

When image data of 1920 pixels×1080 lines is cut out and display image data for the 2D TV is generated in the 2D TV in a case of stereoscopic image data of the side by side scheme or the top and bottom scheme described above, an unnatural image in which left and right identical images or top and bottom identical images are arranged is displayed.

Accordingly, in order to prevent the unnatural image from being displayed in the 2D TV, it is conceivable to set the cropping information included in the video data stream as information used to cut out only one of the left-eye image data and the right-eye image data, for example, only the left-eye image data. In this case, the 2D TV and the 3D TV perform processing as follows.

FIG. 25(a) is a diagram schematically illustrating a process on the stereoscopic image data (3-dimensional image data) of the side by side scheme with the pixel format of 1920×1080 in the 2D TV. In the 2D TV, image data of 1920 pixels×1088 lines can be obtained after the decoding, but 8 lines in the image data are blank data. In this case, based on the cropping information, left-eye image data of 960 pixels×1080 lines is cut out from the image data of 1920 pixels×1080 lines including actual image data. Then, a scaling process is performed on the left-eye image data to generate display image data for the 2D TV. In this case, correct 2-dimensional display (2D display) is performed.

On the other hand, FIG. 25(b) is a diagram schematically illustrating a process on stereoscopic image data (3-dimensional image data) of the side by side scheme with the pixel format of 1920×1080 in the 3D TV. Even in the 3D TV, image data of 1920 pixels×1088 lines can be obtained after the decoding, but 8 lines in the image data are blank data. In this case, based on the cropping information, left-eye image data of 960 pixels×1080 lines is cut out from the image data of 1920 pixels×1080 lines including actual image data.

Then, a scaling process is performed on the left-eye image data to generate image data of 1920 pixels×1080 lines. This image data is the same as the above-described display image data of the 2D TV. Since the side by side scheme is used in the 3D TV, the image data is halved into left and right data and the scaling process is performed on each of the image data to generate the left-eye display image data and the right-eye display image data for the 3D TV. In this case, since the left-eye image and the right-eye image are merely the left half and the right half of one and the same image, correct stereoscopic display (3D display) is not performed.

FIG. 26(a) is a diagram schematically illustrating a process on stereoscopic image data (3-dimensional image data) of the top and bottom scheme with the pixel format of 1920×1080 in the 2D TV. In the 2D TV, image data of 1920 pixels×1088 lines can be obtained after the decoding, but 8 lines in the image data are blank data. In this case, based on the cropping information, left-eye image data of 1920 pixels×540 lines is cut out from the image data of 1920 pixels×1080 lines including actual image data. Then, a scaling process is performed on the left-eye image data to generate display image data for the 2D TV. In this case, the correct 2-dimensional display (2D display) is performed.

On the other hand, FIG. 26(b) is a diagram schematically illustrating a process on stereoscopic image data (3-dimensional image data) of the top and bottom scheme with the pixel format of 1920×1080 in the 3D TV. In the 3D TV, image data of 1920 pixels×1088 lines can be obtained after the decoding, but 8 lines in the image data are blank data. In this case, based on the cropping information, left-eye image data of 1920 pixels×540 lines is cut out from the image data of 1920 pixels×1080 lines including actual image data.

Then, a scaling process is performed on the left-eye image data to generate image data of 1920 pixels×1080 lines. This image data is the same as the above-described display image data of the 2D TV. Since the top and bottom scheme is used in the 3D TV, the image data is halved into top and bottom data and the scaling process is performed on each of the image data to generate the left-eye display image data and the right-eye display image data for the 3D TV. In this case, since the left-eye image and the right-eye image are merely the top half and the bottom half of one and the same image, correct stereoscopic display (3D display) is not performed.

An object of the present technology is to appropriately perform a cutout process based on cropping information on a reception side and to be able to correctly generate display image data.

Solution to Problem

According to a concept of the present technology, an image data transmission device includes:

an image data transmission unit that transmits a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion; and

an information insertion unit that inserts interpretation information of a parameter value of the cropping information into a high-order layer of the video stream.

In the present technology, the image data transmission unit transmits the container of the predetermined format having the video stream which includes the image data and in which the cropping information is inserted into the header portion. For example, the container may be a transport stream (MPEG-2 TS) used in a digital broadcast standard. For example, the container may be a container of MP4 or another format used, for example, in delivery over the Internet.

The information insertion unit inserts the interpretation information of the parameter value of the cropping information into the high-order layer of the video stream. For example, the container may be a transport stream and the information insertion unit may insert the interpretation information under a program map table or an event information table. For example, the information insertion unit may describe the interpretation information in a descriptor inserted under the program map table or the event information table.

For example, the video stream is encoded data of H.264/AVC or HEVC. The cropping information may be defined in a sequence parameter set of the video stream. The information insertion unit may describe the interpretation information in the descriptor inserted under the program map table or the event information table.

For example, when the image data is stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in the horizontal direction or the vertical direction in the same frame, that is, so-called stereoscopic image data of a frame-compatible scheme, the interpretation information is considered to indicate that the parameter value of the cropping information is to be specially interpreted. In this case, when the image data is 2-dimensional image data, the interpretation information is considered to indicate that the parameter value of the cropping information is to be interpreted without change.

For example, when the image data is stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in the horizontal direction or the vertical direction in the same frame, the interpretation information may indicate that the parameter value of the cropping information is interpreted such that a cropping region is doubled in the horizontal direction or the vertical direction. For example, when the image data is stereoscopic image data of the side by side scheme, the interpretation information indicates that the parameter value is interpreted such that a cropping region is doubled in the horizontal direction. For example, when the image data is stereoscopic image data of the top and bottom scheme, the interpretation information indicates that the parameter value is interpreted such that a cropping region is doubled in the vertical direction. In this case, the interpretation information designates the interpretation of the parameter value of the cropping information.

In the present technology, the interpretation information of the parameter value of the cropping information is inserted into the high-order layer of the video stream. Therefore, whether the image data is the 2-dimensional image data or the stereoscopic image data of the frame-compatible scheme, the reception side can appropriately interpret the parameter value of the cropping information based on the interpretation information. Accordingly, it is possible to appropriately perform the cutout process (cropping) based on the cropping information and correctly generate display image data.

In the present technology, for example, the image data may be the 2-dimensional image data or the stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in the horizontal direction or the vertical direction in the same frame. The information insertion unit may be configured to insert the interpretation information changed according to the switched image data into the high-order layer of the video stream at a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data.

In this case, the reception side can acquire the interpretation information changed according to the switched image data before the switching timing of the 2-dimensional image data and the stereoscopic image data. Accordingly, the image data cutout process (cropping) can be performed by the interpretation of the parameter value of the cropping information suitable for the switched image data immediately from the switching timing. Thus, it is possible to prevent an unnatural image from being displayed due to the switching of the image data.

According to another concept of the present technology, an image data reception device includes

an image data reception unit that receives a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion.

Interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream.

The image data reception device further includes

an information acquisition unit that acquires the interpretation information from the container;

a decoding unit that decodes the video stream included in the container to acquire the image data and the cropping information;

and an image data processing unit that interprets the parameter value of the cropping information based on the interpretation information and cuts out image data of a predetermined region from the image data to generate display image data.

In the present technology, the image data reception unit receives the container of the predetermined format having the video stream which includes image data and in which the cropping information is inserted into the header portion, for example, the transport stream. Here, the interpretation information of the parameter value of the cropping information is inserted into the high-order layer of the video stream.

The information acquisition unit acquires the interpretation information from the container. The decoding unit decodes the video stream included in the container and acquires the image data and the cropping information. The image data processing unit interprets the parameter value of the cropping information based on the interpretation information and cuts out the image data of the predetermined region from the image data to generate the display image data.

Thus, in the present technology, the container of the predetermined format having the video stream in which the cropping information is inserted into the header portion is received, and the interpretation information of the cropping information is inserted into the high-order layer of the video stream. Therefore, whether the image data is the 2-dimensional image data or the stereoscopic image data of the frame-compatible scheme, the cropping information can appropriately be interpreted based on the interpretation information. Accordingly, it is possible to appropriately perform the cutout process based on the cropping information and correctly generate the display image data.

In the present technology, for example, the image data may be any one of the 2-dimensional image data and the stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in the horizontal direction or the vertical direction in the same frame. At a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data, the interpretation information changed according to the switched image data may be inserted into the high-order layer of the video stream. From the switching timing of the image data, the image data processing unit may interpret the parameter value of the cropping information based on the interpretation information inserted at a timing prior to the switching timing and changed according to the switched image data.

In this case, the image data cutout process can appropriately be performed by the interpretation of the parameter value of the cropping information suitable for the switched image data immediately from the switching timing. Thus, even when the acquisition of the interpretation information is not synchronized with the switching timing of the image data, it is possible to prevent an unnatural image from being displayed.

Advantageous Effects of Invention

According to the present technology, it is possible to appropriately perform the cutout process based on the cropping information on the reception side and correctly generate the display image data.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram illustrating an example of the configuration of an image transmission and reception system according to an embodiment.

FIG. 2 is a diagram illustrating an example of the data structure of an access unit in a video stream.

FIG. 3 is a diagram illustrating the structure of cropping information defined in an SPS (Sequence Parameter Set) of the access unit.

FIG. 4 is a diagram schematically illustrating a process of receiving stereoscopic image data of a side by side scheme with a pixel format of 1920×1080.

FIG. 5 is a diagram schematically illustrating a process of receiving stereoscopic image data of a top and bottom scheme with a pixel format of 1920×1080.

FIG. 6 is a block diagram illustrating an example of the configuration of a transmission data generation unit of a broadcast station included in an image transmission and reception system.

FIG. 7 is a diagram illustrating an example of the configuration of a transport stream TS.

FIG. 8 is a diagram illustrating an example of another configuration of a transport stream TS.

FIG. 9 is a diagram illustrating an exemplary configuration (Syntax) of an “AVC_video_descriptor.”

FIG. 10 is a diagram illustrating regulation contents (Semantics) of the “AVC_video_descriptor.”

FIG. 11 is a diagram illustrating an exemplary configuration (Syntax) of a “Cropping_interpretation_descriptor.”

FIG. 12 is a block diagram illustrating an example of the configuration of a receiver included in the image transmission and reception system.

FIG. 13 is a flowchart illustrating an example of a cropping control process of a CPU in the receiver.

FIG. 14 is a diagram illustrating an example of flag information of a “cropping_normal_interpretation_flag” described in an AVC video descriptor under a PMT at the time of an operation.

FIG. 15 is a diagram illustrating an example of the configuration of a transport stream TS.

FIG. 16 is a diagram illustrating an example of another configuration of a transport stream TS.

FIG. 17 is a diagram illustrating an exemplary configuration (Syntax) of an “AVC_video_descriptor.”

FIG. 18 is a diagram illustrating regulation contents (Semantics) of the “AVC_video_descriptor.”

FIG. 19 is a diagram illustrating an exemplary configuration (Syntax) of a “Cropping_interpretation_descriptor.”

FIG. 20 is a flowchart illustrating an example of a cropping control process of a CPU in the receiver.

FIG. 21 is a diagram illustrating an example of mode information at the time of an operation in a “cropping_interpretation_mode” described in an AVC video descriptor under a PMT.

FIG. 22 is a diagram illustrating a relation between the display positions of right and left images of an object on a screen and a reproduction position of its stereoscopic image when stereoscopic image display is performed using binocular disparity.

FIG. 23 is a diagram illustrating examples (a side by side scheme and a top and bottom scheme) of a transmission format of stereoscopic image data.

FIG. 24 is a diagram illustrating a process of generating display image data on a reception side.

FIG. 25 is a diagram illustrating image processing in the side by side scheme using cropping information according to the related art.

FIG. 26 is a diagram illustrating image processing in the top and bottom scheme using cropping information according to the related art.

DESCRIPTION OF EMBODIMENTS

Hereinafter, a mode (hereinafter, referred to as an “embodiment”) for carrying out the invention will be described. The description will be made in the following order.

1. Embodiment
2. Modification Examples

1. Embodiment

[Image Transmission and Reception System]

FIG. 1 is a diagram illustrating an example of the configuration of an image transmission and reception system 10 according to an embodiment. The image transmission and reception system 10 includes a broadcast station 100 and a receiver (3D TV) 200. The broadcast station 100 loads a transport stream TS having a video stream that includes image data on an airwave to transmit the transport stream TS.

The image data included in the video stream is 2-dimensional image data or stereoscopic image data of a so-called frame-compatible scheme in which left-eye image data and right-eye image data are divided and arranged in the horizontal direction or the vertical direction in the same frame. Examples of the transmission format of the stereoscopic image data include a side by side scheme (see FIG. 23(a)) and a top and bottom scheme (see FIG. 23(b)).

In this embodiment, a pixel format of the image data is assumed to be 1920×1080. The broadcast station 100 performs encoding on the image data for each block of 16×16. Therefore, the broadcast station 100 adds 8 lines formed from blank data and performs the encoding to obtain the image data of 1920 pixels×1088 lines.

Cropping information is inserted into a header portion of the video stream. When the image data is 2-dimensional image data, the cropping information serves as information that is used to cut out image data of 1920 pixels×1080 lines including actual image data from the decoded image data of 1920 pixels×1088 lines.

The cropping information serves as information that is used to cut out actual left-eye image data or actual right-eye image data from the decoded image data of 1920 pixels×1088 lines when the image data is stereoscopic image data of the frame-compatible scheme. For example, in stereoscopic image data of the side by side scheme, the cropping information serves as information that is used to cut out image data of 960 pixels×1080 lines. Further, for example, in stereoscopic image data of the top and bottom scheme, the cropping information serves as information that is used to cut out image data of 1920 pixels×540 lines.
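The region sizes that the cropping information designates for each kind of image data can be summarized in a short Python sketch. The function name and the format labels ("2d", "side_by_side", "top_and_bottom") are hypothetical and chosen only for this illustration.

```python
def cutout_size(fmt, width=1920, height=1080):
    """Size of the region designated by the cropping information, per image data format."""
    if fmt == "2d":
        return width, height       # the whole actual picture
    if fmt == "side_by_side":
        return width // 2, height  # one eye occupies half of each line
    if fmt == "top_and_bottom":
        return width, height // 2  # one eye occupies half of the lines
    raise ValueError(f"unknown format: {fmt}")


sizes = {fmt: cutout_size(fmt) for fmt in ("2d", "side_by_side", "top_and_bottom")}
```

For the 1920×1080 pixel format of this embodiment, this gives 1920×1080, 960×1080, and 1920×540, matching the values stated above.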

In this embodiment, the video data stream is, for example, an H.264/AVC (Advanced Video Coding) stream. The cropping information is defined in a sequence parameter set (SPS) of the video stream. FIGS. 2(a) and 2(b) are diagrams illustrating examples of the data structures of access units in the video data stream. H.264 defines a picture as a unit called an access unit. FIG. 2(a) is a diagram illustrating the structure of the head access unit of a GOP (Group Of Pictures). FIG. 2(b) is a diagram illustrating the structure of the access unit other than the head access unit of the GOP.

The cropping information is inserted into a portion of an SPS (Sequence Parameter Set) present in the head access unit of the GOP. FIG. 3 is a diagram illustrating the structure (Syntax) of the cropping information defined in the SPS. In the SPS, whether the cropping information is present is indicated by flag information of “frame_cropping_flag.” The cropping information is information that designates a rectangular region as a cutout region of the image data.

“frame_crop_left_offset” indicates a start position in the horizontal direction, that is, a left end position. “frame_crop_right_offset” indicates an end position in the horizontal direction, that is, a right end position. “frame_crop_top_offset” indicates a start position in the vertical direction, that is, a top end position. “frame_crop_bottom_offset” indicates an end position in the vertical direction, that is, a bottom end position. All are expressed as offset values from the top-left position.

When the image data is stereoscopic image data, a “Frame Packing Arrangement SEI message” is inserted into the portion of the SEIs of the access unit. The SEI includes type information indicating which transmission format of stereoscopic image data the image data has.

In the transport stream TS, interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream. This interpretation information is inserted under, for example, a program map table (PMT). Specifically, for example, this interpretation information is described in a descriptor that is inserted under a video elementary loop of the program map table. The descriptor is, for example, a known AVC video descriptor or a newly defined cropping interpretation descriptor (Cropping_interpretation_descriptor).

When the image data is stereoscopic image data of a frame-compatible scheme, the interpretation information indicates that a parameter value of the cropping information is to be specially interpreted. Further, when the image data is 2-dimensional image data, the interpretation information indicates that a parameter value of the cropping information is to be interpreted without change. The interpretation information is inserted at a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data.

The receiver 200 receives the transport stream TS loaded on the airwaves and transmitted from the broadcast station 100. The receiver 200 acquires the interpretation information of the parameter value of the cropping information inserted into the high-order layer of the video stream, as described above, from the transport stream TS. Further, the receiver 200 decodes the video stream and acquires the image data and the cropping information.

The receiver 200 interprets the parameter value of the cropping information based on the interpretation information, cuts out image data of a predetermined region from the image data, and generates display image data. For example, when the image data is 2-dimensional image data, the cropping information serves as information that is used to cut out image data of 1920 pixels×1080 lines including actual image data from the decoded image data of 1920 pixels×1088 lines. In this case, the receiver 200 interprets the parameter value of the cropping information without change, cuts out the image data of 1920 pixels×1080 lines including actual image data from the decoded image data of 1920 pixels×1088 lines, and generates image data for 2-dimensional image display.

For example, when the image data is stereoscopic image data of the frame-compatible scheme, the cropping information serves as information that is used to cut out actual left-eye image data or actual right-eye image data from the decoded image data of 1920 pixels×1088 lines. In this case, the receiver 200 interprets the parameter value of the cropping information such that a cropping region is doubled in the horizontal direction or the vertical direction. Then, the receiver 200 cuts out the image data of 1920 pixels×1080 lines including actual image data from the decoded image data of 1920 pixels×1088 lines, performs a scaling process on each of left-eye and right-eye image data portions, and generates left-eye image data and right-eye image data for stereoscopic image display.
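The doubling of the cropping region can be illustrated with the following Python sketch. It rests on several assumptions that are not taken from the specification: the cropping region is modelled as inclusive pixel positions measured from the top-left corner (the SPS syntax itself encodes its offsets differently), the cropped half is assumed to start at the top-left of the frame (the left-eye image), and the names `CropRect` and `interpret` are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CropRect:
    """Cropping region as inclusive pixel positions from the top-left corner (assumed model)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def size(self):
        return self.right - self.left + 1, self.bottom - self.top + 1


def interpret(crop, fmt, special):
    """Return the cropping region the receiver actually uses.

    special=False: the parameter values are interpreted without change.
    special=True:  the region is doubled in the direction given by the format.
    """
    if not special:
        return crop
    width, height = crop.size
    if fmt == "side_by_side":
        return replace(crop, right=crop.left + 2 * width - 1)   # double horizontally
    if fmt == "top_and_bottom":
        return replace(crop, bottom=crop.top + 2 * height - 1)  # double vertically
    raise ValueError(f"no special interpretation for format: {fmt}")


sbs_crop = CropRect(left=0, top=0, right=959, bottom=1079)   # 960 x 1080 left-eye region
tab_crop = CropRect(left=0, top=0, right=1919, bottom=539)   # 1920 x 540 left-eye region

sbs_3d = interpret(sbs_crop, "side_by_side", special=True)
tab_3d = interpret(tab_crop, "top_and_bottom", special=True)
sbs_2d = interpret(sbs_crop, "side_by_side", special=False)
```

With the special interpretation, both stereoscopic cases yield the full 1920×1080 region containing the left-eye and right-eye portions, while the unchanged interpretation keeps the single-eye region used for 2-dimensional display.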

As described above, the interpretation information is inserted at a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data. From the switching timing of the image data, the receiver 200 interprets the parameter value of the cropping information based on the interpretation information inserted at the timing prior to the switching timing and changed according to the switched image data. That is, the receiver 200 cuts out the image data by the interpretation of the cropping information suitable for the switched image data immediately from the switching timing and generates the display image data.
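One way to picture this timing relationship is a receiver-side holder that stores the interpretation information when it arrives and only starts using it at the switching timing. The Python sketch below is an assumption-laden illustration: the class and method names are hypothetical, the values "normal" and "special" are placeholders, and how the switching timing is detected is not modelled (the caller simply signals it).

```python
class CroppingInterpretationState:
    """Holds the interpretation in use and the one announced ahead of a switch."""

    def __init__(self, initial="normal"):
        self.current = initial  # interpretation applied to pictures being displayed now
        self.pending = None     # interpretation received ahead of the next switch

    def on_interpretation_info(self, info):
        # Interpretation information arrives before the image data switches,
        # so it is stored rather than applied immediately.
        self.pending = info

    def on_switch(self):
        # At the switching timing, the previously received information takes effect.
        if self.pending is not None:
            self.current = self.pending
            self.pending = None
        return self.current


state = CroppingInterpretationState()
before_info = state.current
state.on_interpretation_info("special")  # announced ahead of a 2D-to-3D switch
after_info = state.current               # still the old interpretation
after_switch = state.on_switch()         # new interpretation applies from the switch
```

Because the new interpretation is already held when the switch occurs, the first picture after the switch can be cropped with the interpretation suited to it.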

FIG. 4 is a diagram schematically illustrating a process of receivingthe stereoscopic image data of the side by side scheme in the pixelformat of 1920×1080. After the decoding, the image data of 1920pixels×1088 lines can be obtained, but 8 lines in the image data areblank data.

In a case of a 2-dimensional (2D) display mode, the cropping information(in which an offset position is indicated by a white O mark) isinterpreted without change. Therefore, based on the croppinginformation, for example, left-eye image data of 960 pixels×1080 linesis cut out from the image data of 1920 pixels×1080 lines including theactual image data. Then, the scaling process is performed on theleft-eye image data in the horizontal direction to generate image datafor 2-dimensional image display. In this case, a 2-dimensional image iscorrectly displayed.

In a case of a stereoscopic (3D) display mode, the cropping information (in which an offset position is indicated by a white O mark) is interpreted such that the cropping region is doubled in the horizontal direction (where the changed offset position is indicated by a hatched O mark). Therefore, based on the cropping information, the image data of 1920 pixels×1080 lines including the actual image data is cut out. The cut-out image data of the side by side scheme is halved into left and right images, and the scaling process is performed on each image in the horizontal direction to generate the left-eye image data and the right-eye image data for stereoscopic image display. In this case, a stereoscopic image is correctly displayed.

FIG. 5 is a diagram schematically illustrating a process of receivingstereoscopic image data of the top and bottom scheme in the pixel formatof 1920×1080. After the decoding, the image data of 1920 pixels×1088lines can be obtained, but 8 lines in the image data are blank data.

In the case of the 2-dimensional (2D) display mode, the croppinginformation (in which an offset position is indicated by a white O mark)is interpreted without change. Therefore, based on the croppinginformation, for example, left-eye image data of 1920 pixels×540 linesis cut out from the image data of 1920 pixels×1080 lines including theactual image data. Then, the scaling process is performed on theleft-eye image data in the vertical direction to generate image data for2-dimensional image display. In this case, a 2-dimensional image iscorrectly displayed.

In the case of the stereoscopic (3D) display mode, the cropping information (in which an offset position is indicated by a white O mark) is interpreted such that the cropping region is doubled in the vertical direction (where the changed offset position is indicated by a hatched O mark). Therefore, based on the cropping information, the image data of 1920 pixels×1080 lines including the actual image data is cut out. The cut-out image data of the top and bottom scheme is halved into top and bottom images, and the scaling process is performed on each image in the vertical direction to generate the left-eye image data and the right-eye image data for stereoscopic image display. In this case, a stereoscopic image is correctly displayed.
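The regions described for FIGS. 4 and 5 can be summarized in a short sketch. It is only an illustration under stated assumptions: the function name `source_regions` and the mode strings are hypothetical, the actual image is assumed to occupy the top 1080 lines of the decoded frame, and the left half (or top half) is assumed to carry the left-eye image, as in the examples above.

```python
from typing import List, Tuple

# (x, y, width, height) of a region inside the decoded 1920x1088 frame
Rect = Tuple[int, int, int, int]

# Actual image area; ASSUMPTION: the 8 blank lines sit below it.
ACTUAL_W, ACTUAL_H = 1920, 1080


def source_regions(scheme: str, display_mode: str) -> List[Rect]:
    """Return the regions that are cut out and then scaled to 1920x1080.

    scheme is "side_by_side" or "top_and_bottom" (hypothetical labels);
    display_mode is "2d" or "3d".
    """
    if scheme == "side_by_side":
        half = ACTUAL_W // 2
        views = [(0, 0, half, ACTUAL_H), (half, 0, half, ACTUAL_H)]
    elif scheme == "top_and_bottom":
        half = ACTUAL_H // 2
        views = [(0, 0, ACTUAL_W, half), (0, half, ACTUAL_W, half)]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # 2D display mode uses only the first (assumed left-eye) view;
    # 3D display mode uses both views of the doubled cropping region.
    return views if display_mode == "3d" else views[:1]


print(source_regions("side_by_side", "2d"))
print(source_regions("top_and_bottom", "3d"))
```

Each returned region is subsequently scaled by a factor of two in the direction in which it was halved.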

[Example of Configuration of Transmission Data Generation Unit]

FIG. 6 is a diagram illustrating an example of the configuration of a transmission data generation unit 110 that generates the above-described transport stream TS in the broadcast station 100. The transmission data generation unit 110 includes a data extraction unit (archive unit) 111, a video encoder 112, an audio encoder 113, and a multiplexer 114.

For example, a data recording medium 111a is detachably mounted on the data extraction unit 111. The data recording medium 111a is, for example, a disc-form recording medium or a semiconductor memory. The data recording medium 111a records image data of a plurality of programs transmitted by the transport stream TS.

The image data of each program is configured as, for example, 2-dimensional image data or stereoscopic image data (hereinafter, simply referred to as “stereoscopic image data”) of the frame-compatible scheme. The transmission format of the stereoscopic image data is, for example, the side by side scheme or the top and bottom scheme (see FIGS. 23(a) and 23(b)). The data extraction unit 111 sequentially extracts and outputs image data and audio data of transmission target programs from the data recording medium 111a.

The video encoder 112 encodes the image data output from the data extraction unit 111 using H.264/AVC (Advanced Video Coding) to obtain encoded video data. In the video encoder 112, a stream formatter (not illustrated) provided at a subsequent stage generates a video stream (video elementary stream) including the encoded video data. At this time, the video encoder 112 inserts the cropping information into the header portion of the video stream. As described above, the cropping information is inserted into a portion of the SPS (Sequence Parameter Set) present in the head access unit of the GOP (see FIG. 2(a)).

The audio encoder 113 encodes the audio data output from the data extraction unit 111 using MPEG-2 Audio AAC or the like to generate an audio stream (audio elementary stream). The multiplexer 114 packetizes and multiplexes the elementary streams generated by the video encoder 112 and the audio encoder 113 to generate the transport stream (multiplexed data stream) TS.

Here, the multiplexer 114 inserts the interpretation information of theparameter value of the cropping information into the high-order layer ofthe video stream. The multiplexer 114 inserts the interpretationinformation corresponding to the switched image data at a timing priorto the switching timing of the 2-dimensional image data and thestereoscopic image data.

As described above, for example, the interpretation information isdescribed in the descriptor inserted under the video elementary loop ofthe program map table. The descriptor is, for example, a known AVC videodescriptor or a newly defined cropping interpretation descriptor(Cropping_interpretation_descriptor).

FIG. 7 is a diagram illustrating an example of the configuration of thetransport stream TS. The example of the configuration is an example inwhich flag information of “cropping_normal_interpretation_flag” servingas the interpretation information of the parameter value of the croppinginformation is described in the known AVC video descriptor.

In the example of the configuration, a PES packet, “Video PES1,” of thevideo stream is included. In the video stream, when the included imagedata is stereoscopic image data, “Frame Packing Arrangement SEI message”is inserted into a portion of the SEIs of the access unit, as describedabove. The SEI includes the type information indicating whichtransmission format of stereoscopic image data the image data has.

The transport stream TS includes a PMT (Program Map Table) as PSI (Program Specific Information). The PSI is information describing the program to which each elementary stream included in the transport stream belongs. The transport stream further includes an EIT (Event Information Table) as SI (Service Information) used for management in units of events.

In the PMT, there is a program descriptor (Program Descriptor)describing information regarding the entire program. In the PMT, thereis an elementary loop having information regarding each elementarystream. In the example of the configuration, there is a video elementaryloop (Video ES loop).

In the elementary loop, information such as a packet identifier (PID) is arranged for each stream, and a descriptor describing information regarding the elementary stream is also arranged. In the example of the configuration, audio is omitted to simplify the drawing.

In the example of the configuration, flag information of“cropping_normal_interpretation_flag” is described in“AVC_video_descriptor” included in the video elementary loop (Video ESloop).

FIG. 8(a) is a diagram illustrating an example of another configuration of the transport stream TS. The example of the configuration is an example in which flag information of “cropping_normal_interpretation_flag” serving as the interpretation information of the parameter value of the cropping information is described in a newly defined cropping interpretation descriptor.

In the example of the configuration, flag information of“cropping_normal_interpretation_flag” is described in“Cropping_interpretation_descriptor” inserted into the video elementaryloop (Video ES loop). Although the detailed description is omitted, theremaining configuration is the same as the example of the configurationillustrated in FIG. 7.

When the interpretation of the parameter value of the cropping information is changed for each event, “Cropping_interpretation_descriptor” may be inserted under the EIT, as illustrated in FIG. 8(b).

FIG. 9 is a diagram illustrating an example of the structure (Syntax) of“AVC_video_descriptor.” The descriptor itself already satisfies theH.264/AVC standard. Here, 1-bit flag information of“cropping_normal_interpretation_flag” is newly defined in thedescriptor.

As indicated in the regulation contents (semantics) in FIG. 10, the flaginformation indicates whether the parameter value of the croppinginformation defined in the SPS (Sequence Parameter Set) in the headaccess unit of the GOP is applied without change, in other words,whether the parameter value of the cropping information is speciallyinterpreted.

When the flag information is “0,” the flag information indicates that the parameter value of the cropping information is specially interpreted. In this case, when (frame_crop_right_offset−frame_crop_left_offset) equals ½ of the size (horizontal_size) of the picture in the horizontal direction, the receiver sets the cropping position by applying one of the assignments (1) and (2) below, in which the right-hand side is substituted into the left-hand side, and performs the cropping based on that position. Whether (1) or (2) is applied can be determined depending on whether the interpreted value in (1) is within the range of the picture size.

frame_crop_right_offset=frame_crop_right_offset*2  (1)

frame_crop_left_offset=0  (2)

Likewise, when (frame_crop_bottom_offset−frame_crop_top_offset) equals ½ of the size (vertical_size) of the picture in the vertical direction, the receiver sets the cropping position by applying one of the assignments (3) and (4) below, in which the right-hand side is substituted into the left-hand side, and performs the cropping based on that position. Whether (3) or (4) is applied can be determined depending on whether the interpreted value in (3) is within the range of the picture size.

frame_crop_bottom_offset=frame_crop_bottom_offset*2  (3)

frame_crop_top_offset=0  (4)

When the flag information is “0” but neither of the above descriptions applies, the receiver interprets the parameter value of the cropping information defined in the SPS without change and performs the cropping.

When the flag information is “1,” the receiver interprets the parameter value of the cropping information defined in the SPS without change and performs the cropping.
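The semantics of the flag can be sketched as a small function. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Crop` type and the `interpret` function are hypothetical names, and the four offsets are treated as pixel positions measured from the top-left corner of the picture, which is how the conditions above read.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Crop:
    """Cropping parameters, treated as pixel positions (ASSUMPTION)."""
    left: int
    right: int
    top: int
    bottom: int


def interpret(crop: Crop, flag: int, horizontal_size: int, vertical_size: int) -> Crop:
    """Apply cropping_normal_interpretation_flag to the SPS cropping values."""
    if flag == 1:
        return crop  # interpreted without change
    # Horizontal case: the signalled region is half the picture width.
    if 2 * (crop.right - crop.left) == horizontal_size:
        doubled = crop.right * 2
        if doubled <= horizontal_size:
            return replace(crop, right=doubled)  # assignment (1)
        return replace(crop, left=0)             # assignment (2)
    # Vertical case: the signalled region is half the picture height.
    if 2 * (crop.bottom - crop.top) == vertical_size:
        doubled = crop.bottom * 2
        if doubled <= vertical_size:
            return replace(crop, bottom=doubled)  # assignment (3)
        return replace(crop, top=0)               # assignment (4)
    return crop  # flag is 0 but neither condition applies


left_half = Crop(left=0, right=960, top=0, bottom=1080)
print(interpret(left_half, flag=0, horizontal_size=1920, vertical_size=1080))
```

The choice between (1) and (2), and between (3) and (4), follows the range check described above: the doubled value is used only if it stays within the picture size.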

FIG. 11 is a diagram illustrating an example of the structure (Syntax)of “Cropping_interpretation_descriptor.” An 8-bit field of“descriptor_tag” indicates that this descriptor is“Cropping_interpretation_descriptor.” An 8-bit field of“descriptor_length” indicates the number of bytes of the subsequentdata. Further, 1-bit flag information of“cropping_normal_interpretation_flag” described above is described inthis descriptor.
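A parser for such a descriptor might look like the following sketch. The text gives only the field widths, so two points are assumptions made purely for illustration: the flag is taken to be the most significant bit of the first byte after “descriptor_length” (the remaining bits being reserved), and the tag value 0xF0 used in the example is hypothetical.

```python
from typing import NamedTuple


class CroppingInterpretationDescriptor(NamedTuple):
    descriptor_tag: int
    descriptor_length: int
    cropping_normal_interpretation_flag: int


def parse_cropping_interpretation_descriptor(data: bytes) -> CroppingInterpretationDescriptor:
    """Parse tag (8 bits), length (8 bits), and the 1-bit flag."""
    if len(data) < 3:
        raise ValueError("descriptor needs a tag, a length, and one payload byte")
    tag, length = data[0], data[1]
    if length < 1 or len(data) < 2 + length:
        raise ValueError("descriptor_length does not match the available data")
    # ASSUMPTION: the flag is the MSB of the first payload byte.
    flag = (data[2] >> 7) & 0x01
    return CroppingInterpretationDescriptor(tag, length, flag)


# 0xF0 is a hypothetical tag value chosen only for this example.
print(parse_cropping_interpretation_descriptor(bytes([0xF0, 0x01, 0x7F])))
```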

A process of the transmission data generation unit 110 illustrated inFIG. 6 will be described in brief. The image data (the 2-dimensionalimage data or the stereoscopic image data) of the programs which aresequentially output from the data extraction unit 111 and which are tobe transmitted are supplied to the video encoder 112. The video encoder112 performs encoding of H.264/AVC (Advanced Video Coding) on the imagedata to obtain encoded video data. In the video encoder 112, the streamformatter (not illustrated) provided on a rear stage generates a videostream (video elementary stream) including the encoded video data.

In this case, the video encoder 112 inserts the cropping informationinto the header portion of the video data stream. That is, in this case,the cropping information is inserted into a portion of the SPS (SequenceParameter Set) present in the head access unit of the GOP (see FIGS. 2and 3). When the image data is the stereoscopic image data, the videoencoder 112 inserts “Frame Packing Arrangement SEI message” into aportion of the SEIs of the access unit (see FIG. 2). The SEI includestype information indicating which transmission format of thestereoscopic image data the image data has.

When the image data of the above-described programs to be transmitted is output from the data extraction unit 111, audio data corresponding to the image data is also output from the data extraction unit 111. The audio data is supplied to the audio encoder 113. The audio encoder 113 encodes the audio data using MPEG-2 Audio AAC or the like to generate an audio stream (audio elementary stream) including the encoded audio data.

The video stream generated by the video encoder 112 is supplied to the multiplexer 114. The audio stream generated by the audio encoder 113 is also supplied to the multiplexer 114. The multiplexer 114 packetizes and multiplexes the elementary streams supplied from the encoders to generate a transport stream (multiplexed data stream) TS.

In this case, the multiplexer 114 inserts the interpretation informationof the parameter value of the cropping information into a high-orderlayer of the video data stream. In this case, the interpretationinformation corresponding to the switched image data is inserted at atiming prior to the switching timing of the 2-dimensional image data andthe stereoscopic image data. In this case, the flag information of“cropping_normal_interpretation_flag” serving as the interpretationinformation is described in, for example, the descriptor inserted underthe video elementary loop of the program map table (see FIGS. 7, 8, 9,and 11).

As described above, the transmission data generation unit 110illustrated in FIG. 6 inserts the interpretation information of theparameter value of the cropping information into a high-order layer ofthe video stream. Therefore, even when the image data is any one of the2-dimensional image data and the stereoscopic image data, the receptionside can appropriately interpret the parameter value of the croppinginformation based on the interpretation information, and thus canappropriately perform the cutout process (cropping) based on thecropping information to correctly generate the display image data.

The transmission data generation unit 110 illustrated in FIG. 6 insertsthe interpretation information corresponding to the switched image datainto a high-order layer of the video stream at a timing prior to theswitching timing of the 2-dimensional image data and the stereoscopicimage data. Therefore, the reception side can acquire the interpretationinformation changed according to the switched image data before theswitching timing of the 2-dimensional image data and the stereoscopicimage data. Accordingly, since the image data cutout process (cropping)can be performed by the interpretation of the parameter value of thecropping information suitable for the switched image data immediatelyfrom the switching timing, it is possible to prevent unnatural imagedisplay by the switching of the image data.

[Example of Configuration of Receiver]

FIG. 12 is a diagram illustrating an example of the configuration of thereceiver (3D TV) 200. The receiver 200 includes a CPU 201, a flash ROM202, a DRAM 203, an internal bus 204, a remote control reception unit(RC reception unit) 205, and a remote control transmission unit (RCtransmission unit) 206.

The receiver 200 further includes an antenna terminal 210, a digital tuner 211, a demultiplexer 213, a video decoder 214, view buffers 217L and 217R, an audio decoder 218, and a channel processing unit 219.

The CPU 201 controls a process of each unit of the receiver 200. Theflash ROM 202 stores control software and stores data. The DRAM 203includes a work area of the CPU 201. The CPU 201 loads software or dataread from the flash ROM 202 on the DRAM 203, activates the software, andcontrols each unit of the receiver 200. The RC reception unit 205receives a remote control signal (remote control code) transmitted fromthe RC transmission unit 206 and supplies the remote control code to theCPU 201. The CPU 201 controls each unit of the receiver 200 based on theremote control code. The CPU 201, the flash ROM 202, and the DRAM 203are connected to the internal bus 204.

The antenna terminal 210 is a terminal that inputs a televisionbroadcast signal received by a reception antenna (not illustrated). Thedigital tuner 211 processes the television broadcast signal input to theantenna terminal 210 and outputs a predetermined transport stream TScorresponding to a user's selected channel.

As described above, the transport stream TS has a video stream includingthe image data, and the cropping information is inserted into the headerportion. Here, the image data is 2-dimensional image data orstereoscopic image data. In the transport stream TS, as described above,the flag information of “cropping_normal_interpretation_flag” serving asthe interpretation information of the parameter value of the croppinginformation is inserted into the high-order layer of the video stream.

As described above, for example, the interpretation information isdescribed in the descriptor inserted under the program map table or anevent information table. The descriptor is, for example, a known AVCvideo descriptor or a newly defined cropping interpretation descriptor.In this case, at a timing prior to the switching timing of the2-dimensional image data and the stereoscopic image data, theinterpretation information corresponding to the switched image data isinserted into the high-order layer of the video stream.

The demultiplexer 213 extracts each stream of the video and the audiofrom the transport stream TS output from the digital tuner 211. Thedemultiplexer 213 extracts information such as the program map table(PMT) from the transport stream TS and supplies this information to theCPU 201.

As described above, this information includes the flag information of“cropping_normal_interpretation_flag” serving as the interpretationinformation of the parameter value of the cropping information. The CPU201 interprets the parameter value of the cropping information based onthe flag information and controls the image data cutout process(cropping) on the decoded image data.

The video decoder 214 performs an inverse process to the process of thevideo encoder 112 of the transmission data generation unit 110 describedabove. That is, the video decoder 214 performs a decoding process on theencoded image data included in the video stream extracted by thedemultiplexer 213 to obtain the decoded image data.

As described above, the transmission data generation unit 110 of thebroadcast station 100 adds 8 lines formed from blank data in order toperform the encoding for each block of 16×16 and performs the encodingto obtain the image data of 1920 pixels×1088 lines. Therefore, the videodecoder 214 acquires, as the decoded image data, the image data of 1920pixels×1088 lines to which the 8 lines formed from the blank data areadded.
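The 8 blank lines follow from rounding the line count up to a multiple of the 16×16 block size. The arithmetic can be checked with a short sketch (the function name `padded_size` is hypothetical):

```python
def padded_size(size: int, block: int = 16) -> int:
    """Round size up to the next multiple of the block size."""
    return -(-size // block) * block  # ceiling division, then scale back


lines = padded_size(1080)    # coded number of lines
pixels = padded_size(1920)   # coded number of pixels per line
print(pixels, lines, lines - 1080)
```

The width of 1920 pixels is already a multiple of 16, so padding is needed only in the vertical direction.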

The video decoder 214 extracts header information of the video datastream and supplies the header information to the CPU 201. In this case,a portion of the SPS of the head access unit of the GOP includes thecropping information. When the image data is the stereoscopic imagedata, “Frame Packing Arrangement SEI message” including the typeinformation is inserted into a portion of the SEIs of the access unit.The CPU 201 controls the image data cutout process (cropping) on thedecoded image data based on the cropping information and the SEI.

The video decoder 214 performs the image data cutout process (cropping)on the decoded image data under the control of the CPU 201 andappropriately performs the scaling process to generate display imagedata.

The video decoder 214 performs the following process, when the imagedata is 2-dimensional image data. That is, the video decoder 214 cutsout the image data of 1920 pixels×1080 lines including the actual imagedata from the decoded image data of 1920 pixels×1088 lines and generatesimage data SV for 2-dimensional image display.

The video decoder 214 performs the following process when the image data is stereoscopic image data and the mode is the 2-dimensional display mode. That is, the video decoder 214 cuts out the left-eye image data or the right-eye image data from the image data of 1920 pixels×1080 lines including the actual image data within the decoded image data of 1920 pixels×1088 lines. Then, the video decoder 214 performs the scaling process on the cut-out image data to generate image data SV for 2-dimensional image display (see the 2D display mode in FIGS. 4 and 5).

The video decoder 214 performs the following process, when the imagedata is stereoscopic image data and is in the stereoscopic display mode.That is, the video decoder 214 cuts out the image data of 1920pixels×1080 lines including the actual image data from the decoded imagedata of 1920 pixels×1088 lines.

The video decoder 214 halves the cut image data into left and rightimage data or top and bottom image data and performs the scaling processon each of the image data to generate left-eye image data SL andright-eye image data SR for stereoscopic image display (see the 3Ddisplay mode in FIGS. 4 and 5). In this case, when the image data is thestereoscopic image data of the side by side scheme, the image data ishalved into the left and right image data. When the image data is thestereoscopic image data of the top and bottom scheme, the image data ishalved into the top and bottom image data.
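The halving and scaling steps can be illustrated on a tiny image. This is a sketch under stated assumptions: the function names are hypothetical, the first half is assumed to be the left-eye view, and the scaling is done by simple pixel or line repetition because the text does not specify the scaling filter.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixel values


def split_views(image: Image, scheme: str) -> Tuple[Image, Image]:
    """Halve the cut-out image into two views according to the scheme."""
    if scheme == "side_by_side":
        half = len(image[0]) // 2
        return [row[:half] for row in image], [row[half:] for row in image]
    if scheme == "top_and_bottom":
        half = len(image) // 2
        return [row[:] for row in image[:half]], [row[:] for row in image[half:]]
    raise ValueError(f"unknown scheme: {scheme}")


def scale_2x(view: Image, scheme: str) -> Image:
    """Restore full size by repetition (ASSUMPTION: nearest-neighbour scaling)."""
    if scheme == "side_by_side":
        # Repeat every pixel horizontally.
        return [[p for p in row for _ in range(2)] for row in view]
    # Repeat every line vertically.
    return [row[:] for row in view for _ in range(2)]


frame = [[1, 2, 3, 4], [5, 6, 7, 8]]
sl, sr = (scale_2x(v, "side_by_side") for v in split_views(frame, "side_by_side"))
print(sl, sr)
```

A real receiver would typically use an interpolating filter; repetition is used here only to keep the example short.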

The view buffer 217L temporarily accumulates the 2-dimensional imagedata SV or the left-eye image data SL of 1920 pixels×1080 linesgenerated by the video decoder 214 and outputs the 2-dimensional imagedata SV or the left-eye image data SL to an image output unit such as adisplay. Further, the view buffer 217R temporarily accumulates theright-eye image data SR of 1920 pixels×1080 lines generated by the videodecoder 214 and outputs the right-eye image data SR to the image outputunit such as a display.

The audio decoder 218 performs an inverse process to the process of theaudio encoder 113 of the transmission data generation unit 110 describedabove. That is, the audio decoder 218 performs a decoding process on theencoded audio data included in the audio stream extracted by thedemultiplexer 213 to obtain decoded audio data. The channel processingunit 219 processes the audio data obtained from the audio decoder 218 togenerate audio data SA of each channel used to realize, for example, a5.1 ch surround and outputs the audio data SA to an audio output unitsuch as a speaker.

[Cropping Control]

Control of the cropping (image data cutout process) performed in thevideo decoder 214 by the CPU 201 will be described. The CPU 201 performsthe cropping control in the video decoder 214 based on the croppinginformation, the interpretation information of the parameter value, theSEI including the type information of the stereoscopic image data, andthe like.

FIG. 13 is a flowchart illustrating an example of a cropping controlprocess performed by the CPU 201. The CPU 201 performs a process of theflowchart for each picture. The CPU 201 starts the process in step ST1,and then causes the process to proceed to step ST2. In step ST2, the CPU201 determines whether a mode is the 3D display mode. The user operatesthe RC transmission unit 206 to set the 3D display mode or the 2Ddisplay mode.

When the mode is the 3D display mode, in step ST3, the CPU 201determines whether “cropping_normal_interpretation_flag” which is theinterpretation information of the parameter value of the croppinginformation is “0.” This flag information is set to “0,” when the imagedata is the stereoscopic image data and is for a 3D service inconsideration of 2D compatibility.

When the flag information is “0,” in step ST4, the CPU 201 determineswhether the SEI of “Frame Packing Arrangement SEI message” is detected.The SEI is present, when the image data is the stereoscopic image data.When the SEI is detected, in step ST5, the CPU 201 determines whether(frame_crop_right_offset−frame_crop_left_offset) accords with ½ of thesize (horizontal_size) of the picture in the horizontal direction.

When the image data is the stereoscopic image data of the side by sidescheme, the condition of step ST5 is satisfied. Therefore, when thecondition of step ST5 is satisfied, the CPU 201 causes the process toproceed to step ST6. In step ST6, the CPU 201 interprets the croppinginformation and performs a cropping control process such that thecropping region is doubled in the horizontal direction.

In this case, the CPU 201 changes the parameter value of the croppinginformation as follows depending on whether the region cut out based onthe original cropping information is the left half or the right half.That is, when the region is the left half, the interpretation isperformed as “frame_crop_right_offset=frame_crop_right_offset*2” bysubstituting the right-hand side into the left-hand side, and then thecropping control process is performed. Conversely, when the region isthe right half, the interpretation is performed as“frame_crop_left_offset=0” by substituting the right-hand side into theleft-hand side, and then the cropping control process is performed.

The CPU 201 performs the process of step ST6, and then ends the processin step ST7.

Conversely, when the condition of step ST5 is not satisfied, the CPU 201causes the process to proceed to step ST8. In step ST8, the CPU 201determines whether (frame_crop_bottom_offset−frame_crop_top_offset)accords with ½ of the size (vertical_size) of the picture in thevertical direction.

When the image data is the stereoscopic image data of the top and bottomscheme, the condition of step ST8 is satisfied. Therefore, when thecondition of step ST8 is satisfied, the CPU 201 causes the process toproceed to step ST9. In step ST9, the CPU 201 interprets the croppinginformation such that the cropping region is doubled in the verticaldirection and performs the cropping control process.

In this case, the CPU 201 changes the parameter value of the croppinginformation as follows depending on whether the region cut out based onthe original cropping information is the top half or the bottom half.That is, when the region is the top half, the interpretation isperformed as “frame_crop_bottom_offset=frame_crop_bottom_offset*2” bysubstituting the right-hand side into the left-hand side, and then thecropping control process is performed. Conversely, when the region isthe bottom half, the interpretation is performed as“frame_crop_top_offset=0” by substituting the right-hand side into theleft-hand side, and then the cropping control process is performed.

The CPU 201 performs the process of step ST9, and then ends the process in step ST7. Whether the format of the corresponding picture is the side by side scheme or the top and bottom scheme is, of course, known from the “Frame Packing Arrangement SEI message.”

When the mode is not the 3D display mode in step ST2, when the flag information is “1” in step ST3, when the SEI is not detected in step ST4, or when the condition of step ST8 is not satisfied, the CPU 201 causes the process to proceed to step ST10. In step ST10, the CPU 201 performs the cropping control process without change of the parameter value of the cropping information. The CPU 201 performs the process of step ST10, and then ends the process in step ST7.
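The flowchart of FIG. 13 can be sketched as a single function that returns the step at which the cropping is decided together with the resulting parameter values. The function name and the tuple layout are hypothetical, the offsets are treated as pixel positions, and the left half (or top half) is recognized here by a left (or top) offset of 0, which is one assumed way of making the determination described for steps ST6 and ST9.

```python
from typing import Tuple

Crop = Tuple[int, int, int, int]  # (left, right, top, bottom), as positions


def cropping_control(display_3d: bool, flag: int, sei_detected: bool,
                     crop: Crop, horizontal_size: int,
                     vertical_size: int) -> Tuple[str, Crop]:
    """Per-picture cropping control following steps ST2 to ST10."""
    left, right, top, bottom = crop
    if display_3d and flag == 0 and sei_detected:        # ST2, ST3, ST4
        if 2 * (right - left) == horizontal_size:        # ST5
            if left == 0:
                right *= 2                               # left half signalled
            else:
                left = 0                                 # right half signalled
            return "ST6", (left, right, top, bottom)
        if 2 * (bottom - top) == vertical_size:          # ST8
            if top == 0:
                bottom *= 2                              # top half signalled
            else:
                top = 0                                  # bottom half signalled
            return "ST9", (left, right, top, bottom)
    return "ST10", crop                                  # no change


print(cropping_control(True, 0, True, (0, 960, 0, 1080), 1920, 1080))
```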

FIG. 14 is a diagram illustrating an operational example of the flag information of “cropping_normal_interpretation_flag” described in the AVC video descriptor (AVC_video_descriptor) under the PMT inserted into the system layer. In MPEG, the maximum insertion cycle of the PMT is 100 msec. Therefore, the insertion timing of the PMT does not necessarily coincide with the timing of a video frame. Hereinafter, the description will be made on the assumption that the mode is the 3D display mode.

In the illustrated example, the image data is switched from the2-dimensional image data to the stereoscopic image data at a timing Tb.The AVC video descriptor in which the flag information of“cropping_normal_interpretation_flag” corresponding to the switchedimage data is described is acquired at a timing Ta prior to the timingTb.

Since the switched image data is the stereoscopic image data, “Frame_Packing_SEI_not_present_flag=0” and “cropping_normal_interpretation_flag=0” are set in the AVC video descriptor (AVC_video_descriptor). However, the image data is the 2-dimensional image data up to the timing Tb, and the SEI of “Frame Packing Arrangement SEI message” is not detected.

That is, even when the flag information of“cropping_normal_interpretation_flag=0” is acquired, the CPU 201 doesnot specially interpret the parameter value of the cropping informationup to the timing Tb, interprets the parameter value without change, andperforms the cropping control process. Therefore, the video decoder 214correctly generates the image data SV for the 2-dimensional imagedisplay up to the timing Tb.

At the timing Tb, the SEI of “Frame Packing Arrangement SEI message” isdetected. In the illustrated example, the type information of thestereoscopic image data included in the SEI is set to “3” and the imagedata is known to be the stereoscopic image data of the side by sidescheme. The CPU 201 specially interprets the parameter value of thecropping information from the timing Tb and performs the croppingcontrol process. Therefore, the video decoder 214 correctly generatesthe image data SL and the image data SR for the stereoscopic imagedisplay from the timing Tb.

Likewise, in the illustrated example, the image data is switched fromthe stereoscopic image data to the 2-dimensional image data at a timingTd. The AVC video descriptor in which the flag information of“cropping_normal_interpretation_flag” corresponding to the switchedimage data is described is acquired at a timing Tc prior to the timingTd.

Since the switched image data is the 2-dimensional image data, “Frame_Packing_SEI_not_present_flag=1” and “cropping_normal_interpretation_flag=1” are set in the AVC video descriptor (AVC_video_descriptor). However, the image data is the stereoscopic image data up to the timing Td, and the SEI of “Frame Packing Arrangement SEI message” is detected.

That is, even when the flag information of “cropping_normal_interpretation_flag=1” is acquired, the CPU 201 continues to specially interpret the parameter value of the cropping information up to the timing Td and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SL and the image data SR for the stereoscopic image display up to the timing Td. This can be realized by having the receiver retain “cropping_normal_interpretation_flag=0” from the previous state.

On the other hand, in FIG. 14, in order to perform correct display evenwhen the channel is switched at the timing Td, a display range can bedetermined by normally setting “cropping_normal_interpretation_flag” to“0” and causing the receiver side to interpret the parameter value ofthe cropping information.

When the image data is the stereoscopic image data of the side by side scheme, the receiver side performs the interpretation as follows. That is, when the cutout region can be determined to be the left half, the interpretation is performed as “frame_crop_right_offset=frame_crop_right_offset*2,” the right-hand side being assigned to the left-hand side. Further, when the cutout region can be determined to be the right half, the interpretation is performed as “frame_crop_left_offset=0,” the right-hand side likewise being assigned to the left-hand side.

When the image data is the stereoscopic image data of the top and bottom scheme, the receiver side performs the interpretation as follows. That is, when the cutout region can be determined to be the top half, the interpretation is performed as “frame_crop_bottom_offset=frame_crop_bottom_offset*2,” the right-hand side being assigned to the left-hand side. Further, when the cutout region can be determined to be the bottom half, the interpretation is performed as “frame_crop_top_offset=0,” the right-hand side likewise being assigned to the left-hand side.
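As a non-limiting illustration, the reinterpretation described above may be sketched in Python as follows. The sketch assumes that the cropping parameters are treated as start and end positions of the cutout rectangle, which is how the arithmetic in the text reads, and that a cutout whose start position is 0 is the left half or the top half. The names `Crop` and `reinterpret` are hypothetical and do not appear in the embodiment.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Crop:
    """Cutout rectangle; offsets are assumed to be start/end positions."""
    left: int    # frame_crop_left_offset
    right: int   # frame_crop_right_offset
    top: int     # frame_crop_top_offset
    bottom: int  # frame_crop_bottom_offset


def reinterpret(crop: Crop, scheme: str) -> Crop:
    """Widen a half-frame cutout so that it covers both views."""
    if scheme == "side_by_side":
        if crop.left == 0:
            # Left half: double the right offset.
            return replace(crop, right=crop.right * 2)
        # Right half: move the left offset back to 0.
        return replace(crop, left=0)
    if scheme == "top_and_bottom":
        if crop.top == 0:
            # Top half: double the bottom offset.
            return replace(crop, bottom=crop.bottom * 2)
        # Bottom half: move the top offset back to 0.
        return replace(crop, top=0)
    # Any other scheme: leave the cropping information unchanged.
    return crop
```

In this sketch, a left-half cutout and a right-half cutout of a side by side frame both resolve to the same full-width region, which is the property the interpretation relies on.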

Alternatively, when the interpretation of the parameter value of the cropping information is set for each event, this can be realized by the above-described arrangement illustrated in FIG. 8(b), that is, by inserting “Cropping_interpretation_descriptor” under the EIT.

At the timing Td, the SEI of “Frame Packing Arrangement SEI message” is not detected. The CPU 201 interprets the parameter value of the cropping information without change from the timing Td and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SV for 2-dimensional image display from the timing Td.
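The timing behavior of FIG. 14 may be illustrated by the following Python sketch, which assumes the 3D display mode. It keeps the flag currently in effect separately from a flag received in advance, and promotes the latter only when detection of the SEI changes. This is one possible way of storing the previous state, not necessarily the way the embodiment does it, and the class and method names are hypothetical.

```python
from typing import Optional


class InterpretationState:
    """Tracks cropping_normal_interpretation_flag across a 2D/3D switch."""

    def __init__(self, active_flag: int = 1) -> None:
        self.active_flag = active_flag            # flag in effect for current pictures
        self.pending_flag: Optional[int] = None   # flag received ahead of a switch
        self.last_sei: Optional[bool] = None      # SEI detection for the previous picture

    def on_descriptor(self, flag: int) -> None:
        """Called when a descriptor carrying the flag is acquired (timing Ta or Tc)."""
        if flag != self.active_flag:
            self.pending_flag = flag

    def on_picture(self, sei_detected: bool) -> bool:
        """Return True when the cropping information is to be specially interpreted."""
        switched = self.last_sei is not None and sei_detected != self.last_sei
        if switched and self.pending_flag is not None:
            # The image data actually switched (timing Tb or Td): apply the new flag.
            self.active_flag = self.pending_flag
            self.pending_flag = None
        self.last_sei = sei_detected
        return self.active_flag == 0 and sei_detected
```

Calling `on_descriptor` before the switch therefore has no visible effect until the first picture for which SEI detection changes.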

The operation of the receiver 200 will be described briefly. A television broadcast signal input to the antenna terminal 210 is supplied to the digital tuner 211. The digital tuner 211 processes the television broadcast signal and outputs a predetermined transport stream TS corresponding to the user's selected channel.

The demultiplexer 213 extracts the audio and video elementary streams from the transport stream TS obtained from the digital tuner 211. The demultiplexer 213 also extracts information such as the program map table (PMT) from the transport stream TS and supplies this information to the CPU 201. This information includes the flag information of “cropping_normal_interpretation_flag” serving as the interpretation information of the parameter value of the cropping information.

The video stream extracted by the demultiplexer 213 is supplied to the video decoder 214. The video decoder 214 obtains decoded image data (2-dimensional image data or stereoscopic image data) by performing the decoding process on the encoded image data included in the video stream. The decoded image data is image data of 1920 pixels×1088 lines, to which 8 lines formed from blank data have been added. The video decoder 214 extracts the header information of the video data stream and supplies the header information to the CPU 201. The header information includes the cropping information or the SEI of “Frame Packing Arrangement SEI message.”

The CPU 201 controls the cropping of the video decoder 214 based on the cropping information, the interpretation information of the parameter value, the SEI including the type information of the stereoscopic image data, and the like. In this case, the CPU 201 interprets the parameter value of the cropping information without change, when the image data is the 2-dimensional image data.

The CPU 201 interprets the parameter value of the cropping information without change in the 2D display mode, when the image data is the stereoscopic image data. Further, the CPU 201 interprets the cropping information such that the cropping region is doubled in the horizontal direction or the vertical direction in the 3D display mode, when the image data is the stereoscopic image data.

The video decoder 214 performs the image data cutout process (cropping) on the decoded image data based on the interpreted cropping information under the control of the CPU 201. Further, the video decoder 214 appropriately performs the scaling process on the cut image data to generate the display image data.

Here, the video decoder 214 performs the following process, when the image data is the 2-dimensional image data. That is, the video decoder 214 cuts out the image data of 1920 pixels×1080 lines including the actual image data from the decoded image data of 1920 pixels×1088 lines and generates the image data SV for 2-dimensional image display.

The video decoder 214 performs the following process, when the image data is stereoscopic image data and the 2-dimensional display mode is set. That is, from the decoded image data of 1920 pixels×1088 lines, the video decoder 214 cuts out the left-eye image data or the right-eye image data within the image data of 1920 pixels×1080 lines including the actual image data. Then, the video decoder 214 performs the scaling process on the cut image data to generate image data SV for 2-dimensional image display.

The video decoder 214 performs the following process, when the image data is stereoscopic image data and the stereoscopic display mode is set. That is, the video decoder 214 cuts out the image data of 1920 pixels×1080 lines including the actual image data from the decoded image data of 1920 pixels×1088 lines. The video decoder 214 halves the cut image data into left and right image data or top and bottom image data and performs the scaling process on each of the image data to generate the left-eye image data SL and the right-eye image data SR for stereoscopic image display.
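The cutout, halving, and scaling in the stereoscopic display mode may be sketched in Python as follows. This is only an illustration under stated assumptions: a frame is modeled as a list of pixel rows, the scaling is done by simple pixel or line repetition (the text does not specify the scaling method), and the left-eye image data is assumed to occupy the left half or the top half. All function names are hypothetical.

```python
def cut(frame, left, right, top, bottom):
    """Cut out the rectangle [left, right) x [top, bottom) from a frame."""
    return [row[left:right] for row in frame[top:bottom]]


def scale_horizontal_x2(image):
    """Double the width by repeating each pixel (assumed scaling method)."""
    return [[pixel for pixel in row for _ in range(2)] for row in image]


def scale_vertical_x2(image):
    """Double the height by repeating each line (assumed scaling method)."""
    return [list(row) for row in image for _ in range(2)]


def stereo_views(frame, width, height, scheme):
    """Return (SL, SR) generated from a decoded frame that may carry padding lines."""
    # Cut out the actual image data, e.g. 1920x1080 from 1920x1088.
    active = cut(frame, 0, width, 0, height)
    if scheme == "side_by_side":
        half = width // 2
        left_eye = scale_horizontal_x2(cut(active, 0, half, 0, height))
        right_eye = scale_horizontal_x2(cut(active, half, width, 0, height))
    elif scheme == "top_and_bottom":
        half = height // 2
        left_eye = scale_vertical_x2(cut(active, 0, width, 0, half))
        right_eye = scale_vertical_x2(cut(active, 0, width, half, height))
    else:
        raise ValueError("unsupported scheme: " + scheme)
    return left_eye, right_eye
```

For a broadcast frame, `width` and `height` would be 1920 and 1080, and the 8 blank lines at the bottom of the 1088-line decoded frame are discarded by the first cutout.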

The image data SV for 2-dimensional image display generated by the video decoder 214 and the left-eye image data SL for the stereoscopic image display are output to the image output unit such as a display via the view buffer 217L. Further, the right-eye image data SR for stereoscopic image display generated by the video decoder 214 is output to the image output unit such as a display via the view buffer 217R.

The audio stream extracted by the demultiplexer 213 is supplied to the audio decoder 218. The audio decoder 218 performs the decoding process on the encoded audio data included in the audio stream to obtain decoded audio data. The audio data is supplied to the channel processing unit 219. The channel processing unit 219 processes the audio data to generate audio data SA of each channel used to realize, for example, 5.1-channel surround sound. The audio data SA is output to an audio output unit such as a speaker.

As described above, the CPU 201 of the receiver 200 illustrated in FIG. 12 appropriately interprets the cropping information inserted into the header portion of the video stream based on the interpretation information of the parameter value of the cropping information inserted into the high-order layer of the video stream. Then, based on the interpretation result, the CPU 201 controls the image data cutout process (cropping) performed by the video decoder 214. Accordingly, even when the image data is any one of the 2-dimensional image data and the stereoscopic image data, the video decoder 214 can appropriately perform the image data cutout process, and thus can correctly generate the display image data.

In the receiver 200 illustrated in FIG. 12, the CPU 201 acquires the interpretation information changed according to the switched image data before the switching timing of the image data. However, the interpretation of the parameter value of the cropping information based on the interpretation information is reflected immediately after the image data is actually switched. Accordingly, the image data cutout process can appropriately be performed by the interpretation of the parameter value of the cropping information suitable for the switched image data immediately from the switching timing. Further, even when the acquisition of the interpretation information is not synchronized with the switching timing of the image data, it is possible to prevent an unnatural image from being displayed.

Here, a case will be described in which the transport stream TS from the broadcast station 100 in the image transmission and reception system 10 illustrated in FIG. 1 is received by a legacy 2D receiver (2D TV). In this case, the legacy 2D receiver skips the interpretation information of the parameter value of the cropping information inserted into the high-order layer of the video stream. Therefore, the interpretation information has substantially no effect on the cropping process in the 2D receiver.

2. Modification Examples

In the above-described embodiment, the example has been described in which the flag information of “cropping_normal_interpretation_flag” is described as the interpretation information in the descriptor inserted under the video elementary loop of the program map table. Instead of the flag information, mode information of “cropping_interpretation_mode,” which is described in detail below, may be described as the interpretation information in the descriptor.

FIG. 15 is a diagram illustrating an example of the configuration of a transport stream TS. The example of the configuration is an example in which the mode information of “cropping_interpretation_mode” is described as the interpretation information of the parameter value of the cropping information in a known AVC video descriptor.

In the example of the configuration, a PES packet “Video PES” of a video stream is included. In the video stream, when the included image data is stereoscopic image data, as described above, “Frame Packing Arrangement SEI message” is inserted into a portion of the SEIs of the access unit. The SEI includes type information indicating which transmission format of stereoscopic image data the image data has.

The transport stream TS includes a PMT (Program Map Table) as PSI (Program Specific Information). The PSI is information that describes to which program each elementary stream included in the transport stream belongs. The transport stream also includes an EIT (Event Information Table) as SI (Service Information) used to manage an event unit.

A program descriptor describing information regarding the entire program is present in the PMT. Further, an elementary loop having information regarding each elementary stream is present in the PMT. In the example of the configuration, a video elementary loop (Video ES loop) is present.

In the elementary loop, information such as a packet identifier (PID) is arranged for each stream, and a descriptor describing information regarding the elementary stream is also arranged. In the example of the configuration, audio is not illustrated to simplify the drawing.

In the example of the configuration, mode information of “cropping_interpretation_mode” is described in “AVC_video_descriptor” included in the video elementary loop (Video ES loop).

FIG. 16(a) is a diagram illustrating another example of the configuration of the transport stream TS. The example of the configuration is an example in which mode information of “cropping_interpretation_mode” serving as the interpretation information of the parameter value of the cropping information is described in a newly defined cropping interpretation descriptor.

In the example of the configuration, mode information of “cropping_interpretation_mode” is described in “Cropping_interpretation_descriptor” inserted into the video elementary loop (Video ES loop). Although the detailed description is omitted, the remaining configuration is the same as the example of the configuration illustrated in FIG. 15.

When the interpretation of the parameter value of the cropping information is changed at each event, “Cropping_interpretation_descriptor” may be inserted under the EIT, as illustrated in FIG. 16(b).

FIG. 17 is a diagram illustrating an example of the structure (Syntax) of “AVC_video_descriptor.” The descriptor itself already conforms to the H.264/AVC standard. Here, 2-bit mode information of “cropping_interpretation_mode” is newly defined in the descriptor.

As indicated in the regulation contents (semantics) in FIG. 18, the mode information designates the interpretation of the parameter value of the cropping information defined in the SPS (Sequence Parameter Set) in the head access unit of the GOP. When the mode information is “01,” the mode information designates that the value of frame_crop_right_offset is interpreted as being doubled. This is designed for the stereoscopic image data of the side by side scheme. When the mode information is “10,” the mode information designates that the value of frame_crop_bottom_offset is interpreted as being doubled. This is designed for the stereoscopic image data of the top and bottom scheme. When the mode information is “11,” the mode information designates that the parameter value of the cropping information is interpreted without change.

FIG. 19 is a diagram illustrating an example of the configuration (Syntax) of “Cropping_interpretation_descriptor.” An 8-bit field of “descriptor_tag” indicates that the descriptor is “Cropping_interpretation_descriptor.” An 8-bit field of “descriptor_length” indicates the number of bytes of subsequent data. Further, 2-bit mode information of “cropping_interpretation_mode” described above is described in the descriptor.
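For illustration only, reading the mode information from such a descriptor may be sketched in Python as follows. The text gives the field widths but neither the tag value nor the bit position of the mode information, so the sketch assumes a placeholder tag `HYPOTHETICAL_TAG` and assumes that the 2-bit mode occupies the two most significant bits of the first payload byte, with the remaining bits reserved. The function name is likewise hypothetical.

```python
HYPOTHETICAL_TAG = 0xF0  # placeholder only; the actual descriptor_tag value is not given

# Meanings of cropping_interpretation_mode as stated in the semantics.
MODE_MEANING = {
    0b01: "frame_crop_right_offset is interpreted as being doubled (side by side)",
    0b10: "frame_crop_bottom_offset is interpreted as being doubled (top and bottom)",
    0b11: "the parameter value is interpreted without change",
}


def parse_cropping_interpretation_descriptor(data: bytes) -> int:
    """Return the 2-bit cropping_interpretation_mode from a raw descriptor."""
    if len(data) < 2:
        raise ValueError("descriptor is shorter than its 2-byte header")
    descriptor_tag, descriptor_length = data[0], data[1]
    if descriptor_tag != HYPOTHETICAL_TAG:
        raise ValueError("not a Cropping_interpretation_descriptor")
    payload = data[2:2 + descriptor_length]
    if descriptor_length < 1 or len(payload) != descriptor_length:
        raise ValueError("descriptor payload is missing or truncated")
    # Assumed layout: mode in the top two bits, lower six bits reserved.
    return payload[0] >> 6
```

A receiver would look up the returned value in `MODE_MEANING`; a value absent from the table is not covered by the semantics described here and is left to the caller.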

The video decoder 214 of the receiver 200 performs the same process under the control of the CPU 201, even when the mode information of “cropping_interpretation_mode” is used instead of the flag information of “cropping_normal_interpretation_flag.”

That is, the video decoder 214 performs the following process, when the image data is 2-dimensional image data. The video decoder 214 cuts out the image data of 1920 pixels×1080 lines including the actual image data from the decoded image data of 1920 pixels×1088 lines to generate the image data SV for 2-dimensional image display.

The video decoder 214 performs the following process, when the image data is stereoscopic image data and the 2-dimensional display mode is set. That is, from the decoded image data of 1920 pixels×1088 lines, the video decoder 214 cuts out the left-eye image data or the right-eye image data within the image data of 1920 pixels×1080 lines including the actual image data. Then, the video decoder 214 performs the scaling process on the cut image data to generate image data SV for 2-dimensional image display.

The video decoder 214 performs the following process, when the image data is stereoscopic image data and the stereoscopic display mode is set. That is, the video decoder 214 cuts out the image data of 1920 pixels×1080 lines including the actual image data from the decoded image data of 1920 pixels×1088 lines. The video decoder 214 halves the cut image data into left and right image data or top and bottom image data and performs the scaling process on each of the image data to generate left-eye image data SL and right-eye image data SR for stereoscopic image display.

FIG. 20 is a flowchart illustrating an example of a cropping control process of the CPU 201 when the mode information of “cropping_interpretation_mode” is used. The CPU 201 performs the process of the flowchart for each picture. The CPU 201 starts the process in step ST11, and then causes the process to proceed to step ST12. In step ST12, the CPU 201 determines whether the display mode is the 3D display mode. The user operates the RC transmission unit 206 to set the 3D display mode or the 2D display mode.

When the mode is the 3D display mode, in step ST13, the CPU 201 determines whether the mode information of “cropping_interpretation_mode” is “01.” When the mode information is “01,” in step ST14, the CPU 201 determines whether the SEI of “Frame Packing Arrangement SEI message” is detected. The SEI is present when the image data is the stereoscopic image data. When the SEI is detected, in step ST15, the CPU 201 determines whether (frame_crop_right_offset−frame_crop_left_offset) is equal to ½ of the size (horizontal_size) of the picture in the horizontal direction.

When the image data is the stereoscopic image data of the side by side scheme, the condition of step ST15 is satisfied. Therefore, when the condition of step ST15 is satisfied, the CPU 201 causes the process to proceed to step ST16. In step ST16, the CPU 201 interprets the cropping information and performs a cropping control process such that the cropping region is doubled in the horizontal direction.

In this case, the CPU 201 changes the parameter value of the cropping information as follows depending on whether the region cut out based on the original cropping information is the left half or the right half. That is, when the region is the left half, the cropping control process is performed as “frame_crop_right_offset=frame_crop_right_offset*2”. Conversely, when the region is the right half, the cropping control process is performed as “frame_crop_left_offset=0”.

The CPU 201 performs the process of step ST16, and then ends the process in step ST17.

When the mode is not the 3D display mode in step ST12, when the SEI is not detected in step ST14, or when the condition of step ST15 is not satisfied, the CPU 201 causes the process to proceed to step ST18. In step ST18, the CPU 201 performs the cropping control process without change of the parameter value of the cropping information. The CPU 201 performs the process of step ST18, and then ends the process in step ST17.

When the mode information is not “01” in step ST13, the CPU 201 causes the process to proceed to step ST19. In step ST19, the CPU 201 determines whether the mode information of “cropping_interpretation_mode” is “10.” When the mode information is “10,” in step ST20, the CPU 201 determines whether the SEI of “Frame Packing Arrangement SEI message” is detected.

The SEI is present when the image data is stereoscopic image data. When the SEI is detected, in step ST21, the CPU 201 determines whether (frame_crop_bottom_offset−frame_crop_top_offset) is equal to ½ of the size (vertical_size) of the picture in the vertical direction.

When the image data is the stereoscopic image data of the top and bottom scheme, the condition of step ST21 is satisfied. Therefore, when the condition of step ST21 is satisfied, the CPU 201 causes the process to proceed to step ST22. In step ST22, the CPU 201 interprets the cropping information such that the cropping region is doubled in the vertical direction and performs the cropping control process.

In this case, the CPU 201 changes the parameter value of the cropping information as follows depending on whether the region cut out based on the original cropping information is the top half or the bottom half. That is, when the region is the top half, the cropping control process is performed as “frame_crop_bottom_offset=frame_crop_bottom_offset*2”. Conversely, when the region is the bottom half, the cropping control process is performed as “frame_crop_top_offset=0”.

The CPU 201 performs the process of step ST22, and then ends the process in step ST17.

When the mode information is not “10” in step ST19, when the SEI is not detected in step ST20, or when the condition of step ST21 is not satisfied, the CPU 201 causes the process to proceed to step ST18. In step ST18, the CPU 201 performs the cropping control process without change of the parameter value of the cropping information. The CPU 201 performs the process of step ST18, and then ends the process in step ST17.
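The decision flow of FIG. 20 may be summarized by the following Python sketch, given purely as an illustration. The function name, the string return values, and the representation of the mode information as the strings “01,” “10,” and “11” are assumptions made for the sketch. The step numbers in the comments refer to the flowchart as described above.

```python
def cropping_control(display_3d: bool, mode: str, sei_detected: bool,
                     crop_left: int, crop_right: int,
                     crop_top: int, crop_bottom: int,
                     horizontal_size: int, vertical_size: int) -> str:
    """Decide, for one picture, how the cropping information is interpreted."""
    if not display_3d:                                            # ST12
        return "unchanged"                                        # ST18
    if mode == "01":                                              # ST13
        if sei_detected:                                          # ST14
            if 2 * (crop_right - crop_left) == horizontal_size:   # ST15
                return "double_horizontal"                        # ST16
        return "unchanged"                                        # ST18
    if mode == "10":                                              # ST19
        if sei_detected:                                          # ST20
            if 2 * (crop_bottom - crop_top) == vertical_size:     # ST21
                return "double_vertical"                          # ST22
    return "unchanged"                                            # ST18
```

Because the SEI check is made per picture, a mode value acquired ahead of a switch does not alter the interpretation until the stereoscopic image data actually arrives.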

FIG. 21 is a diagram illustrating an example of the mode information of “cropping_interpretation_mode” described in the AVC video descriptor (AVC_video_descriptor) under the PMT inserted into a system layer during operation. In MPEG, the maximum insertion cycle of the PMT is 100 msec. Therefore, the insertion timing of the PMT does not necessarily coincide with the timing of a frame of the video. Hereinafter, the description will be made on the assumption that the mode is the 3D display mode.

In the illustrated example, the image data is switched from the 2-dimensional image data to the stereoscopic image data at a timing Tb. The AVC video descriptor in which the mode information of “cropping_interpretation_mode” corresponding to the switched image data is described is acquired at a timing Ta prior to the timing Tb.

Since the switched image data is the stereoscopic image data, “Frame_Packing_SEI_not_present_flag=0” and “cropping_interpretation_mode=01” are set in the AVC video descriptor (AVC_video_descriptor). However, the image data is the 2-dimensional image data up to the timing Tb, and the SEI of the “Frame Packing Arrangement SEI message” is not detected.

That is, even when the mode information of “cropping_interpretation_mode=01” is acquired, the CPU 201 does not interpret the value of frame_crop_right_offset as being doubled up to the timing Tb, interprets the value without change, and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SV for the 2-dimensional image display up to the timing Tb.

At the timing Tb, the SEI of “Frame Packing Arrangement SEI message” is detected. In the illustrated example, the type information of the stereoscopic image data included in the SEI is set to “3” and the image data is known to be the stereoscopic image data of the side by side scheme. The CPU 201 interprets the value of frame_crop_right_offset as being doubled from the timing Tb and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SL and the image data SR for the stereoscopic image display from the timing Tb.

Likewise, in the illustrated example, the image data is switched from the stereoscopic image data to the 2-dimensional image data at a timing Td. The AVC video descriptor in which the mode information of “cropping_interpretation_mode” corresponding to the switched image data is described is acquired at a timing Tc prior to the timing Td.

Since the switched image data is the 2-dimensional image data, “Frame_Packing_SEI_not_present_flag=1” and “cropping_interpretation_mode=11” are set in the AVC video descriptor (AVC_video_descriptor). However, the image data is the stereoscopic image data up to the timing Td, and the SEI of the “Frame Packing Arrangement SEI message” is detected.

That is, even when the mode information of “cropping_interpretation_mode=11” is acquired, the CPU 201 continues to interpret the value of frame_crop_right_offset as being doubled up to the timing Td and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SL and the image data SR for the stereoscopic image display up to the timing Td. This can be realized by retaining “cropping_interpretation_mode”=“01” or “10” in the receiver as the previous state.

On the other hand, in FIG. 21, in order to perform correct display even when the channel is switched at the timing Td, a display range can be determined by normally setting “cropping_interpretation_mode” to “01” or “10” and causing the receiver side to interpret the parameter value of the cropping information.

When the image data is the stereoscopic image data of the side by side scheme, the receiver side performs the interpretation as follows. That is, when the cutout region can be determined to be the left half, the interpretation is performed as “frame_crop_right_offset=frame_crop_right_offset*2,” the right-hand side being assigned to the left-hand side. Further, when the cutout region can be determined to be the right half, the interpretation is performed as “frame_crop_left_offset=0,” the right-hand side likewise being assigned to the left-hand side.

When the image data is the stereoscopic image data of the top and bottom scheme, the receiver side performs the interpretation as follows. That is, when the cutout region can be determined to be the top half, the interpretation is performed as “frame_crop_bottom_offset=frame_crop_bottom_offset*2,” the right-hand side being assigned to the left-hand side. Further, when the cutout region can be determined to be the bottom half, the interpretation is performed as “frame_crop_top_offset=0,” the right-hand side likewise being assigned to the left-hand side.

Alternatively, when the interpretation of the parameter value of the cropping information is set for each event, this can be realized by the above-described arrangement illustrated in FIG. 16(b), that is, by inserting “Cropping_interpretation_descriptor” under the EIT.

At the timing Td, the SEI of “Frame Packing Arrangement SEI message” is not detected. The CPU 201 interprets the parameter value of the cropping information without change from the timing Td and performs the cropping control process. Therefore, the video decoder 214 correctly generates the image data SV for 2-dimensional image display from the timing Td.

Thus, even when the mode information of “cropping_interpretation_mode” is described as the interpretation information in the descriptor, the receiver 200 can perform the same process as the process of the above-described embodiment. That is, even in this case, it is possible to obtain the same advantages as those of the above-described embodiment.

In the above-described embodiment, the example has been described in which the image data is subjected to the encoding of H.264/AVC. However, the image data may be subjected to another encoding, for example, MPEG2 video or HEVC (High Efficiency Video Coding). When the encoding of MPEG2 video is performed, the type information of the stereoscopic image data is inserted into, for example, a picture header.

In the above-described embodiment, the image transmission and reception system 10 including the broadcast station 100 and the receiver 200 has been described. However, the configuration of an image transmission and reception system to which the present technology is applicable is not limited thereto. For example, the receiver 200 may include a set-top box and a monitor connected by a digital interface such as the HDMI (High-Definition Multimedia Interface).

In the above-described embodiment, the example has been described in which the container is the transport stream (MPEG-2 TS). However, the present technology is likewise applicable to a system configured such that information is delivered to a reception terminal using a network such as the Internet. In delivery over the Internet, information is delivered in containers of MP4 or other formats in many cases. That is, the transport stream (MPEG-2 TS) used according to the digital broadcast standard and containers of various formats such as MP4 used in delivery over the Internet correspond to the container.

The present technology can be configured as follows.

(1) An image data transmission device includes:

an image data transmission unit that transmits a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion; and

an information insertion unit that inserts interpretation information of a parameter value of the cropping information into a high-order layer of the video stream.

(2) In the image data transmission device described in (1) above,

the interpretation information indicates that the parameter value of the cropping information is specially interpreted,

when the image data is stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame.

(3) In the image data transmission device described in (2) above,

the interpretation information indicates that the parameter value of the cropping information is interpreted such that a cropping region is doubled in the horizontal direction or the vertical direction.

(4) In the image data transmission device described in any one of (1) to (3) above,

the image data is one of 2-dimensional image data and stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame.

The information insertion unit inserts the interpretation information changed according to switched image data into a high-order layer of the video stream at a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data.

(5) In the image data transmission device described in any one of (1) to (4) above, the container is a transport stream.

The information insertion unit inserts the interpretation information under one of a program map table and an event information table.

(6) In the image data transmission device described in (5) above,

the information insertion unit describes the interpretation information in a descriptor inserted under one of the program map table and the event information table.

(7) In the image data transmission device described in (6) above,

the video stream is encoded data of one of H.264/AVC and HEVC.

The cropping information is defined in a sequence parameter set of the video stream.

The information insertion unit describes the interpretation information in the descriptor inserted under one of the program map table and the event information table.

(8) An image data transmission method includes:

an image data transmission step of transmitting a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion;

and an information insertion step of inserting interpretation information of a parameter value of the cropping information into a high-order layer of the video stream.

(9) An image data reception device includes

an image data reception unit that receives a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion.

Interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream.

The image data reception device further includes an information acquisition unit that acquires the interpretation information from the container;

a decoding unit that decodes the video stream included in the container to acquire the image data and the cropping information;

and an image data processing unit that interprets the parameter value of the cropping information based on the interpretation information and cuts out image data of a predetermined region from the image data to generate display image data.

(10) In the image data reception device described in (9) above,

the image data is one of 2-dimensional image data and stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame.

At a timing prior to a switching timing of the 2-dimensional image data and the stereoscopic image data, the interpretation information changed according to the switched image data is inserted into a high-order layer of the video stream.

From the switching timing of the image data, the image data processing unit interprets the parameter value of the cropping information based on the interpretation information inserted at a timing prior to the switching timing and changed according to the switched image data.

(11) An image data reception method includes:

an image data reception step of receiving a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion.

Interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream.

The image data reception method further includes an information acquisition step of acquiring the interpretation information from the container;

a decoding step of decoding the video stream included in the container to acquire the image data and the cropping information;

and an image data processing step of interpreting the parameter value of the cropping information based on the interpretation information and cutting out image data of a predetermined region from the image data to generate display image data.

As the main characteristic of the present technology, when a transport stream (container) of a predetermined format having a video stream in which cropping information is inserted into a header portion is transmitted, interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream, so that the reception side can normally perform the image data cutout process (cropping) using the cropping information in an appropriate manner (see FIGS. 4 and 5).

REFERENCE SIGNS LIST

-   10 IMAGE TRANSMISSION AND RECEPTION SYSTEM
-   100 BROADCAST STATION
-   110 TRANSMISSION DATA GENERATION UNIT
-   111 DATA EXTRACTION UNIT
-   111a DATA RECORDING MEDIUM
-   112 VIDEO ENCODER
-   113 AUDIO ENCODER
-   114 MULTIPLEXER
-   200 RECEIVER
-   201 CPU
-   202 FLASH ROM
-   203 DRAM
-   204 INTERNAL BUS
-   205 REMOTE CONTROL RECEPTION UNIT (RC RECEPTION UNIT)
-   206 REMOTE CONTROL TRANSMISSION UNIT (RC TRANSMISSION UNIT)
-   210 ANTENNA TERMINAL
-   211 DIGITAL TUNER
-   213 DEMULTIPLEXER
-   214 VIDEO DECODER
-   217L, 217R VIEW BUFFER
-   218 AUDIO DECODER
-   219 CHANNEL PROCESSING UNIT

1. An image data transmission device comprising: an image data transmission unit that transmits a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion; and an information insertion unit that inserts interpretation information of a parameter value of the cropping information into a high-order layer of the video stream.
2. The image data transmission device according to claim 1, wherein the interpretation information indicates that the parameter value of the cropping information is specially interpreted, when the image data is stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame.
3. The image data transmission device according to claim 2, wherein the interpretation information indicates that the parameter value of the cropping information is interpreted such that a cropping region is doubled in the horizontal direction or the vertical direction.
4. The image data transmission device according to claim 1, wherein the image data is one of 2-dimensional image data and stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame, and wherein the information insertion unit inserts the interpretation information changed according to switched image data into a high-order layer of the video stream at a timing prior to a switching timing of the two-dimensional image data and the stereoscopic image data.
5. The image data transmission device according to claim 1, wherein the container is a transport stream, and wherein the information insertion unit inserts the interpretation information under one of a program map table and an event information table.
6. The image data transmission device according to claim 5, wherein the information insertion unit describes the interpretation information in a descriptor inserted under one of the program map table and the event information table.
7. The image data transmission device according to claim 6, wherein the video stream is encoded data of one of H.264/AVC and HEVC, wherein the cropping information is defined in a sequence parameter set of the video stream, and wherein the information insertion unit describes the interpretation information in the descriptor inserted under one of the program map table and the event information table.
8. An image data transmission method comprising: an image data transmission step of transmitting a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion; and an information insertion step of inserting interpretation information of a parameter value of the cropping information into a high-order layer of the video stream.
9. An image data reception device comprising: an image data reception unit that receives a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion, wherein interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream, and wherein the image data reception device further includes: an information acquisition unit that acquires the interpretation information from the container; a decoding unit that decodes the video stream included in the container to acquire the image data and the cropping information; and an image data processing unit that interprets the parameter value of the cropping information based on the interpretation information and cuts out image data of a predetermined region from the image data to generate display image data.
10. The image data reception device according to claim 9, wherein the image data is one of 2-dimensional image data and stereoscopic image data in which left-eye image data and right-eye image data are divided and arranged in a horizontal direction or a vertical direction in the same frame, wherein at a timing prior to a switching timing of the two-dimensional image data and the stereoscopic image data, the interpretation information changed according to the switched image data is inserted into a high-order layer of the video stream, and wherein from the switching timing of the image data, the image data processing unit interprets the parameter value of the cropping information based on the interpretation information inserted at a timing prior to the switching timing and changed according to the switched image data.
11. An image data reception method comprising: an image data reception step of receiving a container of a predetermined format having a video stream which includes image data and in which cropping information is inserted into a header portion, wherein interpretation information of a parameter value of the cropping information is inserted into a high-order layer of the video stream, and wherein the image data reception method further includes: an information acquisition step of acquiring the interpretation information from the container; a decoding step of decoding the video stream included in the container to acquire the image data and the cropping information; and an image data processing step of interpreting the parameter value of the cropping information based on the interpretation information and cutting out image data of a predetermined region from the image data to generate display image data.